Integrating Time Alignment and Neural Networks for High Performance Continuous Speech Recognition
Authors
Abstract
Successful application of existing connectionist methods to continuous speech recognition requires the use of time-alignment procedures. These procedures, usually based on dynamic programming, provide means for supervising the training of neural networks. This paper describes two systems in which neural network classifiers are merged with dynamic programming (DP) time alignment methods to produce high performance continuous speech recognizers. One system uses the Connectionist Viterbi Training (CVT) procedure, in which a neural network with frame-level outputs is trained using guidance from a time alignment procedure. The other system uses Multi-State Time Delay Neural Networks (MS-TDNNs), in which embedded DP time alignment allows network training with only word-level external supervision. CVT has been described previously [1]; only changes to the system and new results on the TI Digits task are reported here. The newest CVT results on the TI Digits are 99.1% word accuracy and 98.0% string accuracy. MS-TDNNs, introduced in this paper, are described in more detail here, with attention focused on their basic architecture, the training procedure, and results of applying MS-TDNNs to continuous speaker-dependent alphabet recognition: on two speakers, word accuracy is 97.5% and 89.7%, respectively.
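To make the idea of DP time alignment concrete, the following is a minimal Python sketch, not the CVT or MS-TDNN implementation from the paper. It assumes a hypothetical table `log_scores[t][s]` of per-frame scores (for example, logs of network outputs) and a simple left-to-right state sequence in which each frame either stays in the current state or advances by one. The function name `align` and the toy scores are illustrative assumptions.

```python
import math


def align(log_scores, n_states):
    """Monotonic DP alignment of T frames to n_states left-to-right states.

    log_scores[t][s] is an assumed score for frame t being in state s.
    The path starts in state 0, ends in state n_states - 1, and at each
    frame either stays in the same state or advances by exactly one.
    Returns (best_total_score, state_index_per_frame).
    """
    n_frames = len(log_scores)
    if n_frames < n_states:
        raise ValueError("need at least one frame per state")

    neg_inf = -math.inf
    best = [[neg_inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    best[0][0] = log_scores[0][0]

    for t in range(1, n_frames):
        for s in range(n_states):
            stay = best[t - 1][s]
            advance = best[t - 1][s - 1] if s > 0 else neg_inf
            if advance > stay:
                best[t][s] = advance + log_scores[t][s]
                back[t][s] = s - 1
            else:
                best[t][s] = stay + log_scores[t][s]
                back[t][s] = s

    # Trace back from the final state to recover one state per frame.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[n_frames - 1][n_states - 1], path


# Toy example: the first two frames favor state 0, the last two favor state 1.
toy_scores = [[0, -5], [0, -5], [-5, 0], [-5, 0]]
score, path = align(toy_scores, 2)
```

In a scheme of this kind, the recovered `path` could serve as frame-level targets for the next round of network training, which is one way an alignment procedure can supervise a frame-level classifier.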
Similar resources
Convolutional Neural Network with Adaptive Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy keeps improving, a considerable gap remains between their performance and human recognition ability. This is partly due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert spoken utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Neural Network Based Recognition System Integrating Feature Extraction and Classification for English Handwritten
Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. It has numerous applications, including reading aids for the blind, bank cheques, and conversion of any handwritten document into structured text form. Neural Network (NN) with its inherent learning ability offers promising solutions for handwritten characte...
Introducing Deep Modular Neural Networks with a Dual Spatio-Temporal Structure for Improving Persian Continuous Speech Recognition
In this article, growable deep modular neural networks for continuous speech recognition are introduced. These networks can be grown to implement the spatio-temporal information of the frame sequences at their input layer as well as their labels at the output layer at the same time. The trained neural network with such double spatio-temporal association structure can learn the phonetic sequence...
Scaly Neural Networks for Speech Recognition Using DTW and Time Alignment Algorithms
Speech recognition has been an active research topic for more than 50 years. Interacting with the computer through speech is an active research field, particularly for the disabled community, who face a variety of difficulties in using the computer. Such research in Automatic Speech Recognition (ASR) is investigated for different languages because each language has its specific fea...
Journal title:
Volume, issue
Pages -
Publication date 2004